HMM-based automatic eye-blink synthesis from speech
نویسندگان
چکیده
In this paper we present a novel technique to automatically synthesise eye blinking from a speech signal. Animating the eyes of a talking head is important as they are a major focus of attention during interaction. The developed system predicts eye blinks from the speech signal and generates animation trajectories automatically employing a ”Trajectory Hidden Markov Model”. The evaluation of the recognition performance showed that the timing of blinking can be predicted from speech with an F-score value upwards of 52%, which is well above chance. Additionally, a preliminary perceptual evaluation was conducted, that confirmed that adding eye blinking significantly improves the perception the character. Finally it showed that the speech synchronised synthesised blinks outperform random blinking in naturalness ratings.
منابع مشابه
A Fast, Robust, Automatic Blink Detector
Introduction “Blink” is defined as closing and opening of the eyes in a small duration of time. In this study, we aimed to introduce a fast, robust, vision-based approach for blink detection. Materials and Methods This approach consists of two steps. In the first step, the subject’s face is localized every second and with the first blink, the system detects the eye’s location and creates an ope...
متن کاملAutomatic Sentence Selection from Speech Corpora Including Diverse Speech for Improved HMM-TTS Synthesis Quality
Using publicly available audiobooks for HMM-TTS poses new challenges. This paper addresses the issue of diverse speech in audiobooks. The aim is to identify diverse speech likely to have a negative effect on HMM-TTS quality. Manual removal of diverse speech was found to yield better synthesis quality despite halving the training corpus. To handle large amounts of data an automatic approach is p...
متن کاملAutomatic Segmentation Combining and Spectral Boundary
Currently, AT&T Labs’ Natural Voices multilingual TTS system produces high-quality synthetic speech with a largescale speech corpus [1]. In the development of such systems, automatic segmentation constitutes a major component technology. The prevalent approach for automatic segmentation in speech synthesis is Hidden Markov Model (HMM) based. Even though an HMM-based approach is the most automat...
متن کاملImproved Automatic Extraction of Generation Process Model Commands and Its use for Generating Fundamental Frequency Contours for Training HMM-based Speech Synthesis
Generation process model of fundamental frequency (F0) contours can well represent F0 movements of speech keeping a clear relation with linguistic information of utterances. Therefore, by using the model, improvement of HMM-based speech synthesis is expected. One of major problems preventing the use of the model is that the performance of automatic extraction of the model parameters from observ...
متن کاملSimultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis
In this paper, we describe an HMM-based speech synthesis system in which spectrum, pitch and state duration are modeled simultaneously in a unified framework of HMM. In the system, pitch and state duration are modeled by multi-space probability distribution HMMs and multi-dimensional Gaussian distributions, respectively. The distributions for spectral parameter, pitch parameter and the state du...
متن کامل